1. INTRODUCTION

Drowsiness during driving is a severe problem and is believed to be a direct contributing cause of 
traffic accidents [1, 2]. It places the lives of drivers and passengers at risk and can cause serious accidents on 
major roads. According to a U.S. National Highway Traffic Safety Administration (NHTSA) report in 
2017 [3], drowsiness and falling asleep while driving were responsible for at least 100,000 automobile crashes
and 846 deaths within a year. The National Police Agency of Japan also released data showing that 
approximately 434,000 traffic accidents occurred in 2017 [4]. Previous studies theorized that the causes of 
accidents might be related to factors such as lack of concentration during driving and poor driving skills. 
However, those shortcomings can be rectified by improving driver awareness and driving skills. 

Various methods for detecting drowsiness have been proposed. Among the most popular is 
implementing a trajectory sensor inside the target vehicle [5, 6]. This sensor measures the magnitude of the 
steering wheel angle and its velocity, as well as the frequency with which the drowsy driver correctly 
positions the steering wheel angle. Placing the sensor inside the vehicle is more convenient for the driver
than attaching it to the driver directly. However, the road surface and condition may reduce the
detection accuracy of the sensor. 

We theorize that drowsiness itself is strongly related to the physical condition of a person, 
and therefore, drowsiness detection can be improved by directly investigating this condition. Fundamentally, 
the actual state of the human body is usually determined by placing electrodes or bio-sensors on the body 
itself. Several previous studies have reported another approach for detecting drowsiness that
involves using human biological signals, such as eye movements and eye blinking obtained using an
electrooculogram (EOG) [7, 8], heartbeats using an electrocardiogram (ECG) [9-11], brain activity using an
electroencephalogram (EEG) [12-14], muscle activity using an electromyogram (EMG) [15, 16],
and pulse rate activity [17]. However, skin contact by electrodes or a bio-sensor could cause driver
discomfort during driving. Another appropriate method for detecting and estimating the drowsiness state of a 
driver with minimal or no skin contact is therefore needed. 

An alternative method of estimating drowsiness is by using a camera to record eye behavior. 
Previous studies reported that it is possible to detect drowsiness using numerous less-intrusive techniques and 
minimizing skin contact by placing a camera in front of the driver to capture the face and eyes. For instance,
eyelid movement [18], gaze and head [19, 20], eye tracking and pupil position [21], facial expression detection
[22], facial expression monitoring [23], blink detection [24], eye state analysis [25, 26], portion of eye closure
[27], and eyelid closure [28] have been investigated. These methods had the same goal of providing 
information related to the subject/driver condition while performing various tasks or under various conditions 
(e.g., rest, fatigue, and drowsiness). However, although previous approaches could estimate the drowsy 
condition, there were drawbacks such as the necessity to provide a clear view and stable positioning of the 
camera during the recording process. 

Our proposed system employs a head-mounted eye tracker to obtain eye properties during driving.
We found that this kind of arrangement has rarely been used to date, even though a head-mounted
eye tracker can overcome view and position limitations while evaluating the driver’s eye properties.
As previously described, most studies used images of the subject’s eye and facial movements to evaluate
the subject’s condition, especially drowsiness. However, we could not find any clear information on how to
use the driver’s gaze to determine his/her condition. In this study, we therefore focused on eye gazing,
which has been utilized only to a limited extent. To assess drowsiness, we rated each subject’s
condition using facial expression evaluation (FEE) [29], in accordance with the experiment’s location and
environment. We chose this evaluation method so that the actual condition of the subject could be observed
from several points of view using the same source information.

Thus, we estimated the relationship between drowsiness and gazing parameters in three categories 
of drowsiness. We hypothesized that these three categories have a strong relationship with the eye-gazing 
properties, especially for estimating the condition before actual drowsiness to prevent accidents. 
We investigated whether each feature of the gazing properties has a significant difference and examined the 
performance of related features using a support vector machine (SVM). Finally, we confirmed whether the 
gazing signal could be used as an actual parameter to assess drowsiness while driving. 


2. METHODS 
2.1. Subjects 

Eleven healthy males aged 21-35 years participated in this study. Before the
experiment, written informed consent for this study was obtained from each participant. The participants 
were asked to get sufficient sleep during the night and have their lunch before participating in the experiment. 
They were also asked not to consume alcohol or caffeine before the experiment. 


2.2. Tasks 

We used a driving simulator (DA-1110, Honda Motor, Japan), an eye gaze tracker (TalkEye Lite,
Takei Scientific Instruments, Japan), a web camera (HD Pro Webcam C920, Logicool, China), a computer for
recording the driver’s facial expression, and the driving simulator system control, as shown in Figure 1.
Each subject was asked to drive an automatic transmission car on an oval track without obstacles in daytime
conditions while maintaining a speed of 100 km/h for 50 minutes. Sessions were scheduled twice
per day, from 8:00 am to 10:00 am and from 1:00 pm to 3:00 pm. Each subject participated
in eight trials on different days. All procedures used in this study were approved by the
Ethical Committee of the Faculty of Advanced Science and Technology, Kumamoto University. 


2.3. Recordings 
2.3.1 Physiological Measurement 

We mounted the eye gaze tracker, as shown in Figure 2, on the head to obtain and record the eye 
gaze signal at a sampling rate of 30 Hz. In addition, the subject’s face was recorded using the web camera for 
psychological measurement. 

Figure 1. Research environment

Figure 2. Eye gaze tracker


2.3.2 Psychological Measurement 

In FEE, several evaluators rate the subject’s state from their own perspectives, all using the
same recording of the subject’s facial expression during driving. The evaluators also need to align
their judgments so that their perceptions are reasonably similar; this matching of perceptions is
carried out jointly at the end of the evaluation. We therefore expected FEE to provide a reliable
measure of the condition of the subject.

FEE was used to evaluate each subject’s drowsiness condition. It consists of a five-level (0-4) 
drowsiness questionnaire, in which every number represents a drowsiness degree, from very alert to very 
sleepy. On completion of the experiment, four examiners evaluated each subject’s drowsiness condition 
every epoch (1 epoch = 30 s), according to the FEE questionnaire shown in Table 1, by watching the video of 
the subject’s facial expression recorded by the web camera while the subject was driving. The final FEE 
evaluation score was decided by the majority vote of the four examiners. 
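
The majority vote over the four examiners can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function name `majority_fee` and the sample ratings are hypothetical, and because the text does not state how a tied vote was resolved, the sketch returns `None` for a tie so that such epochs can be settled separately.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_fee(scores: Sequence[int]) -> Optional[int]:
    """Return the most common FEE grade among the examiners, or None on a tie."""
    counts = Counter(scores).most_common()
    # Assumption: a tie for first place has no majority. The source does not
    # describe tie handling, so the tie is flagged instead of guessed.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


# Hypothetical ratings: one row per 30-s epoch, one value per examiner.
epoch_ratings = [[0, 0, 1, 0], [1, 2, 2, 2], [3, 3, 4, 4]]
final_scores = [majority_fee(ratings) for ratings in epoch_ratings]
```

With an even number of examiners a two-against-two split is possible, which is why the tie case is made explicit here.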

Before and after each trial, each subject was instructed to maintain a resting state in the driving
simulator’s seat for 5 min. Then, the subjects were asked to drive for 50 min in the driving simulator,
and a video of each subject’s face was recorded.


Table 1. Facial Expression Evaluation and its Criteria

Grade  Drowsiness stage  Action criteria
0      Not drowsy        Quick and frequent eye shift, active body movement
1      Somewhat drowsy   Open lip, slow eye movement
2      Drowsy            Slow and frequent eyeblink, mouth movement
3      Quite drowsy      Conscious eyeblink, head shake, frequent yawn
4      Very drowsy       Close eyelid, head tilt forward or fall behind





2.4. Analyses 
2.4.1 Feature Extraction 

Before extracting gazing properties, the threshold used to judge gazing had to be determined. Gazing
was considered to occur when the moving speed remained below the threshold.
To find the optimum threshold, we calculated the number of frames in which gazing occurred (1 frame =
1/30 s) per epoch for thresholds between the minimum and maximum values. We then calculated the sum of the
differences (SOD) of the frames in which gazing occurred, as shown in Figure 3, and took the threshold
interval with the maximum SOD value as the optimum threshold candidate. In this example, the maximum SOD
value was found at a threshold of 2-3 deg/s, so 2 and 3 deg/s were the final threshold candidates.
The neighboring interval on the left of the candidate was 1-2 deg/s and the one on the right was
3-4 deg/s. If the SOD value of the left neighbor was closer to the maximum SOD value, then 2 deg/s became
the optimum threshold; otherwise, if the SOD value of the right neighbor was closer, then 3 deg/s became
the optimum threshold. In this case, the left neighbor was closer, so we used 2 deg/s as our gazing
occurrence threshold. The optimum threshold differed between subjects and trials. In a total of 88 trials,
there were 31 trials with an optimum threshold of 2 deg/s, 30 trials with an optimum threshold of 3 deg/s,
and 27 trials with an optimum threshold of 4 deg/s.

The gazing signal was generated from the moving speed. After obtaining the optimum threshold,
we constructed and extracted the gazing signal, as shown in Figure 4. When the moving speed was less than
the threshold and greater than zero, gazing occurred. In contrast, when the moving speed was greater than the
threshold, there was no gazing. When the moving speed was zero, blinking occurred. In this example,
a threshold of 2 deg/s was used. The numbers of gazing, blinking, and non-gazing occurrences were
calculated on a frame-by-frame basis, and consecutive occurrences of the same state were counted as one
cluster. Nine features of the gazing signal, listed in Table 2, were extracted and computed for every
epoch. The process was repeated for all the features in each trial.


Table 2. Features Extracted from Gazing Signal

Parameter                                Abbrev  Feature
Gazing frame                             GF      Number of frames in which gazing occurred per epoch
Gazing cluster                           GC      Number of clusters in which gazing occurred per epoch
Non-gazing frame                         NF      Number of frames in which non-gazing occurred per epoch
Non-gazing cluster                       NC      Number of clusters in which non-gazing occurred per epoch
Blink frame                              BF      Number of frames in which blinks occurred per epoch
Blink cluster                            BC      Number of clusters in which blinks occurred per epoch
Ratio of gazing frames vs. clusters      RG      GF/GC
Ratio of non-gazing frames vs. clusters  RN      NF/NC
Ratio of blink frames vs. clusters       RB      BF/BC
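
The frame labeling and the nine features can be illustrated with a short sketch. It is not the authors' implementation: the function names and the sample speeds are hypothetical, BC is treated as the number of blink clusters (consistent with RB = BF/BC), a speed exactly equal to the threshold is assumed to be non-gazing, and a ratio is assumed to be 0.0 when a state never occurs in the epoch.

```python
from itertools import groupby
from typing import Dict, List, Sequence


def label_frames(speeds: Sequence[float], threshold: float) -> List[str]:
    """Label each frame from its moving speed (deg/s)."""
    labels = []
    for speed in speeds:
        if speed == 0:
            labels.append("blink")      # zero speed: blinking
        elif speed < threshold:
            labels.append("gaze")       # above zero, below threshold: gazing
        else:
            # Assumption: a speed exactly equal to the threshold is non-gazing.
            labels.append("non_gaze")
    return labels


def epoch_features(speeds: Sequence[float], threshold: float = 2.0) -> Dict[str, float]:
    """Compute the nine gazing features for one epoch of frame speeds."""
    frames = {"gaze": 0, "non_gaze": 0, "blink": 0}
    clusters = {"gaze": 0, "non_gaze": 0, "blink": 0}
    # groupby yields one run per stretch of consecutive identical labels,
    # so each run counts as one cluster.
    for state, run in groupby(label_frames(speeds, threshold)):
        frames[state] += sum(1 for _ in run)
        clusters[state] += 1

    def ratio(n_frames: int, n_clusters: int) -> float:
        # Assumption: the ratio is 0.0 when the state never occurs.
        return n_frames / n_clusters if n_clusters else 0.0

    return {
        "GF": frames["gaze"], "GC": clusters["gaze"],
        "NF": frames["non_gaze"], "NC": clusters["non_gaze"],
        "BF": frames["blink"], "BC": clusters["blink"],
        "RG": ratio(frames["gaze"], clusters["gaze"]),
        "RN": ratio(frames["non_gaze"], clusters["non_gaze"]),
        "RB": ratio(frames["blink"], clusters["blink"]),
    }


# Hypothetical 8-frame excerpt (a full 30-s epoch at 30 Hz has 900 frames).
example = epoch_features([1.0, 1.5, 3.0, 0.0, 0.0, 0.5, 4.0, 5.0], threshold=2.0)
```

In this excerpt the labels form five runs (gaze, non-gaze, blink, gaze, non-gaze), so the frame counts and cluster counts differ and the ratios describe the mean run length of each state.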





2.4.2. Statistics 

Before conducting statistical analysis, we investigated whether each feature correlated with the
condition of the subject as rated by FEE. Then, a Kolmogorov-Smirnov test was used to examine whether the
gazing signal was normally distributed. One-way ANOVA was used if the data were normally
distributed; otherwise, the Wilcoxon rank-sum test was used. The results of the feature extraction
process were divided into three categories: alert (FEE = 0), lightly drowsy (FEE = 1-2), and heavily drowsy
(FEE = 3-4). After dividing the gazing signal into these three categories, statistical analysis was performed
to investigate the significant differences among the three categories. A value of p < 0.05 was considered to be
statistically significant.
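
As an illustration of the non-parametric comparison across the three categories, the sketch below maps FEE grades to categories and computes a Kruskal-Wallis H statistic, the test reported with the results. It is a minimal example under stated assumptions, not the analysis code of the study: the function names and sample values are hypothetical, the p-value uses the chi-square approximation (which for three groups has two degrees of freedom and reduces to exp(-H/2)), and that approximation is rough for very small samples.

```python
import math
from typing import Sequence, Tuple


def fee_to_category(fee: int) -> str:
    """Map an FEE grade (0-4) to one of the three drowsiness categories."""
    if fee == 0:
        return "alert"
    if fee in (1, 2):
        return "lightly_drowsy"
    if fee in (3, 4):
        return "heavily_drowsy"
    raise ValueError(f"FEE grade out of range: {fee}")


def kruskal_wallis_3(groups: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Tie-corrected H statistic and approximate p-value for exactly three groups."""
    if len(groups) != 3 or any(len(g) == 0 for g in groups):
        raise ValueError("expected three non-empty groups")
    pooled = sorted((x, gi) for gi, g in enumerate(groups) for x in g)
    n = len(pooled)
    rank_sums = [0.0, 0.0, 0.0]
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1..j and all receive the mean rank.
        mean_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            rank_sums[pooled[k][1]] += mean_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = 12.0 / (n * (n + 1)) * sum(
        rs ** 2 / len(g) for rs, g in zip(rank_sums, groups)
    ) - 3 * (n + 1)
    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical")
    h /= correction
    # Chi-square survival function with 2 degrees of freedom.
    return h, math.exp(-h / 2.0)


# Hypothetical feature values for the alert, lightly and heavily drowsy groups.
h_stat, p_value = kruskal_wallis_3([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

For real data each group would hold one feature (for example GF) over all epochs assigned to that category.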


2.4.3. Classification 

In our study, an SVM was used as a classifier to evaluate the performance of the features in
the three categories, alert (FEE = 0), lightly drowsy (FEE = 1-2), and heavily drowsy (FEE = 3-4), by
using the LIBSVM library [30], which has also been utilized by Akbar et al. [31]. The features were first
combined into one dataset; then, one half was used to make the training data and the other half the testing
data (training set: 50%, test set: 50%). The average percentage of total true detection from 4-fold
cross-validation was used as a measure of classification accuracy. A radial basis function (RBF) was used
as the SVM kernel function. The best values of the cost and gamma parameters of the RBF kernel were set
automatically by using LIBSVM.
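
For reference, the RBF kernel in its standard form is given below. The symbols are notation introduced here rather than taken from the study: x_i and x_j denote two feature vectors (for example, the nine gazing features of two epochs) and gamma is the kernel width parameter mentioned above.

```latex
K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\gamma \,\lVert \mathbf{x}_i - \mathbf{x}_j \rVert^{2}\right), \qquad \gamma > 0
```

A larger gamma makes the kernel more local, so the decision boundary follows individual training points more closely, while the cost parameter sets the penalty for misclassified training points. The two parameters are therefore usually tuned together.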

To optimize the classification process, we applied SVM recursive feature elimination (RFE),
a wrapper-based method, to each subject. SVM RFE was developed by Guyon et al.
[32] for gene selection in cancer classification and was used by Ebrahimi et al. [33] for automatic
sleep staging. The steps of the SVM RFE feature selection algorithm used in this study were as follows.
First, one feature was removed and the accuracy was computed. Subsequently, the feature whose removal
yielded the highest accuracy was eliminated. The feature eliminated in the previous step could be used in
the next step. This operation was repeated until every feature had been removed. The feature eliminated
first was considered the worst-contributing feature and the feature eliminated last was considered the
best-contributing feature. The final step was to sort the features from best to worst, then compute the
accuracy from the best feature combination of each subject.
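
The elimination loop can be sketched as a generic accuracy-driven backward elimination. This is an illustration under stated assumptions rather than the authors' code: `score` is a hypothetical callable standing in for the cross-validated SVM accuracy of a feature subset, an eliminated feature is assumed to be excluded from later rounds, and the best combination is assumed to be the best-scoring prefix of the ranked list. The toy weights are invented for the example.

```python
from typing import Callable, List, Sequence, Tuple

Score = Callable[[List[str]], float]


def backward_elimination(features: Sequence[str], score: Score) -> List[str]:
    """Rank features from best to worst by repeatedly dropping one feature."""
    remaining = list(features)
    eliminated: List[str] = []
    while len(remaining) > 1:
        # Try removing each feature in turn and drop the one whose removal
        # leaves the highest accuracy (the least useful feature this round).
        drop = max(remaining, key=lambda f: score([g for g in remaining if g != f]))
        remaining.remove(drop)
        eliminated.append(drop)
    eliminated.extend(remaining)
    # Eliminated first = worst, eliminated last = best.
    return eliminated[::-1]


def best_combination(ranked: Sequence[str], score: Score) -> Tuple[List[str], float]:
    """Assumption: evaluate the top-k ranked features for each k, keep the best."""
    candidates = [list(ranked[:k]) for k in range(1, len(ranked) + 1)]
    scores = [score(c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return candidates[best], scores[best]


# Toy stand-in for SVM accuracy: hypothetical per-feature contributions.
WEIGHTS = {"RB": 0.3, "GF": 0.2, "NF": -0.1}


def toy_score(subset: List[str]) -> float:
    return 0.5 + sum(WEIGHTS.get(name, 0.0) for name in subset)


ranking = backward_elimination(["GF", "NF", "RB"], toy_score)
combo, accuracy = best_combination(ranking, toy_score)
```

In the toy run, NF is dropped first because removing it raises the score, and RB survives to the end, so the ranking places RB first.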


3. RESULTS 

Figure 5 shows the relationship between features according to the condition represented by FEE. 
Gazing frame (GF), gazing cluster (GC), non-gazing cluster (NC), and the ratio of GF/GC (RG) show a 
decreasing tendency with increasing FEE. In contrast, blink frame (BF), blink cluster (BC), the ratio of 
BF/BC (RB), and the ratio of non-gazing frame (NF)/NC (RN) show an increasing tendency with increasing 
FEE. NF does not show any tendency and is inconsistent with FEE. 

Table 3 summarizes the typical statistical results of all the features in the three categories:
alert, lightly drowsy, and heavily drowsy. GF, GC, and NC exhibit a statistically significant
difference (p < 0.001; Kruskal-Wallis test) between categories, together with a decreasing trend.
BF, the ratio of BF/BC (RB), and the ratio of NF/NC (RN) also exhibit a statistically significant
difference (p < 0.001; Kruskal-Wallis test) between categories, together with an increasing trend.
In contrast, even though BC and the ratio of GF/GC (RG) tended to follow FEE during the drowsiness
state, these parameters were not considered to have a significant difference in any of the drowsiness
categories. Moreover, NF also shows no significant difference owing to its inconsistent tendency with
FEE during the drowsiness state. From the statistical results in these three classes, we obtained both
increasing and decreasing properties that represent the drowsiness condition, especially for category 2,
which describes the state before becoming drowsy.

Figure 3. SOD of frames in which gazing occurred

Figure 4. Production of gazing from moving speed for feature extraction

Figure 5. Features tendency according to FEE


Table 3. Statistical Results of all Features, Three Categories: Alert, Lightly Drowsy, Heavily Drowsy

Parameter  Alert            Lightly drowsy       Heavily drowsy
GF         211.39 ± 50.86   165.88 ± 59.91 ***   112.34 ± 58.34 ***, ###
GC         128.33 ± 21.78   98.63 ± 30.28 ***    69.08 ± 30.61 ***, ###
NF         658.36 ± 50.10   667.54 ± 76.45 ***   627.28 ± 109.76 ###
NC         133.67 ± 20.68   106.06 ± 28.84 ***   78.45 ± 28.13 ***, ###
BF         22.06 ± 26.74    36.15 ± 32.86 ***    128.31 ± 114.43 ***, ###
BC         5.16 ± 3.39      7.54 ± 4.78          10.07 ± 5.93 ***, ###
RG         1.63 ± 0.18      1.58 ± 0.16          1.48 ± 0.19 ***, ###
RN         5.10 ± 1.20      6.86 ± 2.39 ***      9.01 ± 3.69 ***, ###
RB         4.82 ± 1.41      9.95 ± 4.66 ***      21.18 ± 17.02 ***, ###

*** p < 0.001 vs. alert, ### p < 0.001 vs. lightly drowsy; all values are expressed as mean ± SD


The results of the three categories show that the gazing parameters could be used to estimate 
drowsiness. However, several subjects did not show any significant differences corresponding to the 
psychological measurements using FEE in all categories. We assume that this was caused by differences in
perception among the examiners, both while examining the subjects’ physical state during driving and
while watching the video recordings of the subjects driving.

We used all nine parameter features during the classification analysis. Table 4 shows that the SVM 
was able to detect the drowsiness with an overall accuracy of 76.3% in the three categories of state: alert, 
lightly drowsy, and heavily drowsy. 


Table 4. Classification Results of all Features for the Three Categories: Alert, Lightly Drowsy, Heavily Drowsy

Subject  Accuracy [%]  Best combination
1        85.4          NC, BF, BC, RG
2        66.2          GF, GC, NF, BC, RG, RN
3        69.2          GF, GC, NF, NC, BC
4        68.0          NC, BF, RG, RN
5        63.8          NF, BF, BC, RG, RN
6        73.8          GC, NF, NC, BF, BC, RG, RN
7        78.3          GF, GC, NF, BF, BC, RG, RN
8        85.8          GC, NF, BF, BC, RG, RN
9        77.6          GF, GC, NF, NC, BC, RG, RN, RB
10       87.6          RB
11       83.4          GF, GC, NF, BF, BC, RG, RB
Overall  76.3


4. DISCUSSION

We investigated the relationship between gazing properties and drowsiness using features of the 
gazing parameter and the drowsiness condition using FEE scores during driving. Drowsiness is considered to 
be related to the human condition. The simplest way to determine the state of the human body is by directly 
asking the subject their present condition or having them and the examiner complete a drowsiness 
assessment. Several researchers have used facial expression evaluation (FEE) in questionnaires to obtain the 
physical condition, especially the drowsiness condition, of a subject. Moreover, previous studies have also 
evaluated the performance of the FEE (as a drowsiness evaluation tool). Consequently, it has been concluded 
that the fluctuation of FEE values represents the condition of the subject becoming drowsy. In this study, 
we observed the drowsiness condition of subjects during driving using an actual driving simulator. 
We obtained the gazing signal by using a head-mounted eye-tracking device, then extracted and analyzed the
features of the gazing parameter, using the FEE scores as our drowsiness evaluation method.

Several studies have investigated drowsiness based on eye properties. For instance, Jackson et al. 
[34] investigated slow eyelid closure as a measure of driver drowsiness by measuring slow eye closure 
(PERCLOS) while drivers performed a simulated driving task. However, their study is still limited in its 
discussion of the parameters associated with the physical condition, especially the drowsiness condition. 
Moreover, they placed the camera in front of the driver, which does not provide a clear view for the driver 
and is unstable in terms of position. 

Ma’touq et al. [35] used eye blinking to detect driver drowsiness. They proposed a device for
monitoring a driver’s drowsiness by detecting and classifying the eye blinking into normal blinking (NB) or 
prolonged blinking (PB). However, they did not discuss the relationship of the parameters associated with the 
physical condition, especially the drowsiness condition. 

Wang and Xu [36] investigated drowsiness based on eye properties. They detected drowsiness
by using eye features (percentage of eye closure (PERCLOS), average pupil diameter, and blink duration)
combined with driving behavior parameters. They further used multilevel ordered logit (MOL), ordered logit
(OL), and artificial neural network (ANN) models to determine drowsiness in three drowsiness categories.
Their results showed overall accuracies of 64.15% for MOL, 52.70% for OL, and 56.04% for ANN
(MOL had the highest detection accuracy). Their study also confirmed that eye features performed better
than driving behavior in drowsiness detection, because removing the eye features reduced the
accuracy. However, their study has a lower accuracy in the detection of drowsiness than
our study.

Drowsiness has also been investigated based on eye properties using machine learning or 
classification methods. Hu and Zheng [37] used an SVM to classify the drowsiness condition in three 
categories with an overall accuracy of 80.74% in a driving simulator environment. They detected drowsiness 
via eyelid related parameters using EOG. Although they obtained a higher accuracy than that obtained in our 
study, their study has a drawback in that electrodes were attached to the driver, which could cause discomfort 
during driving. 

In this study, we used a head-mounted eye tracker to overcome view and position limitations, and to 
eliminate intrusion while extracting eye properties. This kind of investigation has rarely been conducted. 
To assess the drowsiness condition, we conducted a subjective evaluation of each subject’s physical 
condition using FEE. We focused on three categories for estimating the condition before actual drowsiness to 
prevent accidents. Only a few studies have been conducted on gazing properties related to drowsiness. 
A novel parameter was presented in this study. We found that the features of the gazing had significant
statistical differences in three drowsiness categories: alert (FEE=0), lightly drowsy (FEE=1-2), and heavily
drowsy (FEE=3-4). Several features of the gazing, namely gazing occurrence frames per epoch (GF),
gazing occurrence clusters per epoch (GC), blink occurrence frames per epoch (BF), non-gazing occurrence
clusters per epoch (NC), ratio of blinking frames versus clusters (RB = BF/BC), and the ratio of non-gazing
frames versus clusters (RN = NF/NC), had statistically significant differences (p < 0.001; Kruskal-Wallis
test). Overall, seven features were sufficient to detect the drowsiness condition in three categories.

Based on those results, an SVM was used to examine the performance of the features to classify the 
drowsiness condition. In the three categories, alert (FEE=0), lightly drowsy (FEE=1-2), and heavily drowsy
(FEE=3-4), the SVM was able to detect the drowsiness with an overall accuracy of 76.3%. In addition,
in two categories, alert (FEE=0) and drowsy (FEE=1-4), the SVM was able to detect the drowsiness with an 
overall accuracy of 89.0%. Note that the result of the classification is subject-dependent as it was calculated 
using each subject’s data. The classification results show that the features of gazing can be used to detect 
three drowsiness categories from the best features combination of each subject; these results were confirmed 
via the FEE questionnaire. 

The features of the gazing parameter could be used as a new parameter or variable in drowsiness
assessment to capture the characteristics and state of the eyes. We believe that these combinations
effectively represent aspects of the eye properties and could be used to determine the drowsy condition.
The eyes are commonly known to be a part of the human body that clearly reflects whether a person is
asleep or awake. In terms of gazing properties, a person’s eyes easily become unfocused when he/she
starts to become drowsy or fall asleep. In contrast, a person’s gaze is focused when concentrating on a
specific object. By considering these phenomena and using the gazing properties, we obtained useful
parameters and showed that the ratios of different parameters, as well as the gazing features themselves,
provided significant differences and contributed to the classification results for each subject.
Gazing parameters are composed of several features, and combining specific features could improve the
estimation of drowsiness.

However, our current study has the following limitation. In order to induce drowsiness, we asked 
each subject to drive in unrealistic conditions, such as on an oval track with no obstacles and no speed 
changes. In reality, people drive on various roads while controlling their vehicle’s speed and discerning 
signposts and other vehicles. In such a scenario with obstacles, the subject would have to look at the 
obstacles in order to drive safely. Consequently, we hypothesize that non-gazing would occur more 
frequently. We will examine whether our proposed features show the same results regardless of the scenario 
in future work. 


5. CONCLUSION 

In this study, a novel parameter and its features were proposed to detect drowsiness, and statistical
and classification techniques were used to quantify the performance of gazing properties representing several 
drowsiness condition levels. Our results indicate that the proposed gazing parameter can effectively assess 
the drowsiness level of a driver. 